Abstract

In this paper, we propose a general framework for the asymptotic analysis of node-based verification-based algorithms. In our analysis we let the signal length n tend to infinity. We also let the number of non-zero elements of the signal k scale linearly with n. Using the proposed framework, we study the asymptotic behavior of the recovery algorithms over random sparse matrices (graphs) in the context of compressive sensing. Our analysis shows that there exists a success threshold on the density ratio k/n, before which the recovery algorithms are successful, and beyond which they fail. This threshold is a function of both the graph and the recovery algorithm. We also demonstrate that there is a good agreement between the asymptotic behavior of recovery algorithms and finite length simulations for moderately large values of n.

I. Introduction

Compressive sensing was introduced with the idea of representing a signal $\underline{V} \in \mathbb{R}^n$ having k non-zero elements with measurements $\underline{C} \in \mathbb{R}^m$, where $k < m \ll n$, while still being able to recover the original signal $\underline{V}$ [1], [2]. In the measuring process, also referred to as encoding, signal elements are mapped to measurements through a linear transformation represented by the matrix multiplication $\underline{C} = \underline{V} G$, where the matrix $G \in \mathbb{R}^{n \times m}$ is referred to as the sensing matrix. This linear mapping can also be characterized by a bipartite graph [5], referred to as the sensing graph.

In the recovery process, also referred to as decoding, we try to estimate the original signal based on the knowledge of the measurements and the sensing matrix. For given n, m, and k, a decoder is called successful if it recovers the original signal entirely. Two performance measures, namely the density ratio $\gamma = k/n$ and the oversampling ratio $r_o = m/k$, are used to measure and compare the performance of recovery algorithms in the context of compressive sensing.¹

Researchers have worked intensively in the following main areas: 1) designing G for given k and n in order to reduce the number of measurements m required for a successful recovery, 2) improving the recovery algorithms for given n and m to be able to reconstruct signals with larger density ratio, i.e., signals with more non-zero components, and 3) analyzing performance measures of different recovery algorithms in the asymptotic regime (as $n \to \infty$) in order to compare different algorithms and to estimate the performance for finite n.
Donoho in [1] and Candes et al. in [2] used sensing matrices with i.i.d. Gaussian entries and $\ell_1$-norm minimization of the signal estimate as the reconstruction method. Their random sensing matrices contain mostly non-zero elements, which makes the encoding computationally intensive. Inspired by the good performance of sparse matrices in channel coding, some researchers (e.g., [?] and [?]) used sparse matrices as the sensing matrix.

From the viewpoint of recovery complexity, the $\ell_1$ minimization algorithm has a computational complexity of $O(n^3)$. To reduce this complexity, some researchers used iterative algorithms as the decoder. For example, the authors in [5] used an iterative algorithm with a computational complexity of $O(n \log n)$ over regular bipartite graphs, while Xu and Hassibi in [3] discussed a different iterative algorithm with a complexity of $O(n)$ based on a class of sparse graphs called expander graphs [7]. The authors in [8] and [9] proposed and analyzed an iterative thresholding algorithm over dense graphs with complexity between $O(n \log n)$ and $O(n^2)$, depending on the sensing matrix used. The two verification-based (VB) iterative algorithms originally proposed in [10] in the context of channel coding were analyzed in [11]-[13] for the case when $k/n \to 0$ as $n \to \infty$ in the context of compressed sensing. In [14] and [15], asymptotic analysis of some iterative message-passing algorithms over sparse sensing matrices can be found. The sensing matrices used in [11]-[15] are all sparse.

Our main goal in this paper is to develop a framework for the asymptotic analysis (as $n \to \infty$) of VB algorithms over sparse random sensing matrices and to extend it to include recovery algorithms of a similar nature, such as the one in [3]. In our work we show that the overall computational complexity of the analysis is linear in the number of iterations. We also show, through simulation, that VB algorithms applied to signals of moderate length (on the order of $10^5$) are in good agreement with the asymptotic results. Using our approach, we can perform a comprehensive study and comparison of the performance/complexity trade-off of different VB recovery algorithms over a variety of sparse graphs.

The rest of the paper is organized as follows. In Section II, we present the notation, definitions, and assumptions used throughout the paper. We also introduce VB algorithms in more detail. In Section III, the encoding process and input distributions are described. Decoding algorithms are described in Section IV. The analysis framework and its generalization are introduced in Sections V and VI, respectively. Simulation results are presented in Section VII.



¹ For successful decoding we clearly need $r_o \geq 1$. It is desirable to have this parameter as small as possible. Indeed, in the asymptotic case ($n \to \infty$), $r_o = 1$ is achievable. This has been proved in [4].



II. Background 

A. Bipartite Sensing Graph 

In general, the sensing matrix G can be thought of as the weighted incidence matrix of a weighted bipartite graph. In this case, the element in row i and column j of G is the coefficient of the $i$-th signal element ($v_i$) in the linear combination resulting in the $j$-th measurement $c_j$. If the weights are all 1 ($G \in \{0,1\}^{n \times m}$), then G reduces to the incidence matrix of a bipartite graph.

Consider a bipartite graph with node sets V and C. Following channel coding terminology, we will call V the set of variable 
nodes and C the set of check nodes. In the compressive sensing context, signal components and measurements are mapped to 
variable nodes and check nodes, respectively. We will interchangeably use the terms variable nodes and signal elements as 
well as check nodes and measurements. 

In regular bipartite graphs, each variable node (check node) is incident to the same number $d_v$ ($d_c$) of check nodes (variable nodes). The numbers $d_v$ and $d_c$ are called the variable node degree and the check node degree, respectively. All graphs discussed in this paper are sparse regular bipartite graphs, denoted by the pair $(d_v, d_c)$ and simply referred to as graphs.

For a variable node $v_i$ we use the notation $\mathcal{M}(v_i) \subset C$ to denote the set of check nodes incident to it. The graph composed of a subset $V^*$ of variable nodes, their neighboring check nodes $\mathcal{M}(V^*)$, and all the edges in between is called the subgraph induced by $V^*$.
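As a minimal sketch of the objects defined above, the following Python snippet builds a random $(d_v, d_c)$ regular bipartite graph without parallel edges and computes the measurements for unit edge weights. The function names, the socket-shuffling construction with random swaps to repair parallel edges, and the list-of-neighbors representation are illustrative assumptions rather than the authors' construction; in particular, the swap repair is not guaranteed to produce an exactly uniform sample.

```python
import random
from collections import Counter


def random_regular_bipartite(n, dv, dc, seed=0):
    """Return m = n*dv/dc checks, each a list of dc distinct variable indices.

    Hypothetical helper: shuffle the n*dv edge sockets, cut them into blocks
    of dc, then repair blocks containing a repeated variable (a parallel
    edge) by swapping the offending socket with a randomly chosen one.
    """
    if (n * dv) % dc != 0 or n < dc:
        raise ValueError("need n*dv divisible by dc and n >= dc")
    m = n * dv // dc
    rng = random.Random(seed)
    sockets = [v for v in range(n) for _ in range(dv)]
    rng.shuffle(sockets)

    def block(c):
        return sockets[c * dc:(c + 1) * dc]

    pending = list(range(m))              # checks that still need inspection
    while pending:
        c = pending.pop()
        seen = set()
        for offset, v in enumerate(block(c)):
            if v in seen:                 # parallel edge found at this socket
                i = c * dc + offset
                j = rng.randrange(len(sockets))
                sockets[i], sockets[j] = sockets[j], sockets[i]
                pending.extend((c, j // dc))   # recheck both touched checks
                break
            seen.add(v)
    return [block(c) for c in range(m)]


def variable_degrees(checks):
    """Number of check nodes incident to each variable node."""
    return Counter(v for nbrs in checks for v in nbrs)


def measure(signal, checks):
    """Measurement c_j = sum of the signal elements adjacent to check j (weights all 1)."""
    return [sum(signal[v] for v in nbrs) for nbrs in checks]


checks = random_regular_bipartite(n=12, dv=3, dc=4, seed=1)
signal = [0.0] * 12
signal[5] = 2.5                           # a single non-zero element
measurements = measure(signal, checks)
```

With a single non-zero signal element, exactly $d_v$ measurements are non-zero, which is the graph-side picture of the encoding $\underline{C} = \underline{V} G$.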

B. Verification Based Algorithms 

Two iterative algorithms over bipartite graphs are proposed and analyzed in [10] for packet-based error correction in the context of channel coding. In these algorithms, a variable node can be in one of two states: "verified" or "unknown". Under certain circumstances, a variable node is verified and a value is assigned to it. This variable node then contributes to the verification of other nodes. The decoding process continues until either all unknown variable nodes become verified or the process makes no further progress. Due to the verification nature of the procedure, the two algorithms in [10] are called verification-based (VB) algorithms. When used in the context of compressive sensing, we would like VB algorithms to verify signal elements correctly in each iteration. Indeed, in Section III we define sufficient conditions for VB algorithms to result in the original signal.

As noted in [13], the authors in [10] defined the two VB algorithms using a node-based (NB) representation but analyzed them using a message-based (MB) representation. In the NB representation, the "verified" state of a variable node is a property of the node itself. In the MB representation, however, the state is reflected in the outgoing messages from a variable node. Therefore, in contrast to the NB case, multiple different states may exist for the same variable node. In [13], the authors showed that for one of the algorithms the two versions perform the same, but for the other algorithm the NB version outperforms the MB one (in compressive sensing this implies that the NB version can successfully recover signals with larger density ratios).

A well-known method to analyze such iterative algorithms in coding theory is density evolution [16]. However, as density evolution can only be applied to the MB representations, the authors in [13] used differential equations to analyze the NB versions in the case where $n \to \infty$. Applying their analysis to $(d_v, d_c)$ graphs, the number of differential equations is roughly $(d_c^2 + 3 d_c)/2$, which becomes intractable for large $d_c$. Therefore, the authors used numerical calculations to determine the success or failure of the NB algorithms.

In the context of compressive sensing, the authors in [11] analyzed the MB-VB algorithms using density evolution for super-sparse signals ($k/n \to 0$ as $n \to \infty$). In our work, we analyze NB-VB algorithms in the regime where $n \to \infty$ and k grows linearly with n. In Section V, we show that the complexity of our methodology is less than that of the one used in [13].

III. Encoding and Input Distribution 

Let $\mathcal{K}$ denote the set of non-zero elements in the original signal. We refer to this set as the support set. In general, there are two ways to define signal elements in compressive sensing:

1) Let $k = |\mathcal{K}| = \gamma n$ be a deterministic value. Out of n signal elements, k are selected at random as the support set. The value of each non-zero element is then an i.i.d. random variable with probability distribution f.

2) Let $\alpha$, referred to as the density factor, be the probability that a signal element belongs to $\mathcal{K}$. With $\alpha$ fixed, each of the n signal elements takes an i.i.d. value with the following distribution: it is zero with probability $1 - \alpha$ or follows the distribution f with probability $\alpha$. In this case, k and $\gamma = k/n$ are random variables. Furthermore, $E[\gamma] = \alpha$ and $E[k] = \alpha n$, where $E[\cdot]$ denotes the expected value.

When $n \to \infty$, as a consequence of the law of large numbers, both cases (1) and (2) provide the same results. In the rest of the paper, we adopt the second model. In this paper, we show that using NB-VB recovery algorithms in the asymptotic regime as $n \to \infty$, a limiting value exists for $\alpha$, before which the recovery algorithm is successful and beyond which it is not. Henceforth, we refer to this limit as the success threshold.
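The second signal model can be sketched in a few lines of Python. The function names and the choice of a standard Gaussian for the distribution f are assumptions made for illustration (any continuous distribution would serve). For large n the empirical density ratio k/n concentrates around $\alpha$, which is the law-of-large-numbers argument used above.

```python
import random


def sample_signal(n, alpha, seed=0):
    """Each element is non-zero with probability alpha, drawn from f (here N(0, 1))."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) if rng.random() < alpha else 0.0
            for _ in range(n)]


def density_ratio(signal):
    """Empirical density ratio gamma = k / n."""
    k = sum(1 for x in signal if x != 0.0)
    return k / len(signal)


signal = sample_signal(n=20000, alpha=0.3, seed=0)
gamma = density_ratio(signal)   # close to alpha = 0.3 for large n
```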

The weights of the bipartite graph, corresponding to the non-zero entries of the sensing matrix G, can be drawn i.i.d. according to a distribution g. In this work, we make the assumption that at least one of the distributions f or g is continuous. Similar conditions have been used in [6] and [?]. As a consequence, we introduce and prove Theorem 1 below.



Theorem 1. Let $c_i$ and $c_j$ be two distinct check nodes and $V_i$ and $V_j$ be their corresponding sets of incident variable nodes in $\mathcal{K}$; i.e., $V_i = \mathcal{M}(c_i) \cap \mathcal{K}$ and $V_j = \mathcal{M}(c_j) \cap \mathcal{K}$. Suppose at least one of the distributions f or g described before is continuous. If $c_i = 0$, then $V_i$ is the empty set with probability one. Moreover, if $V_i \neq V_j$, then:

$$\Pr(c_i = c_j) = 0.$$

Before proving the theorem, we state the Uniqueness of Samples fact, which is used in the proof.

Fact (Uniqueness of Samples). Let $x_i$ and $x_j$ be two independent samples drawn from a continuous distribution. It follows that:

$$\Pr(x_i = x_j) = 0.$$

In other words, no two independent continuous samples will have the same value, almost surely. More generally, if c denotes any constant, then

$$\Pr(x_i = c) = 0.$$

Proof of Theorem 1: The value of a check node $c_j$ is $\sum_{i : v_i \in \mathcal{M}(c_j)} w_{ij} v_i$, where $w_{ij}$ is the weight associated with the edge connecting the variable node $v_i$ to the check node $c_j$. Thus, a check node will have a continuous distribution whenever at least one of its neighboring variable nodes belongs to the support set, and will be zero otherwise. The proof is then complete according to the Uniqueness of Samples fact. ■

Based on Theorem 1, the following statements are correct with probability one (almost surely):
S1: if two check nodes $c_i$ and $c_j$ have the same non-zero value, they are both neighbors of the same elements of the set $\mathcal{K}$, i.e., $\{\mathcal{M}(c_i) \cap \mathcal{K}\} = \{\mathcal{M}(c_j) \cap \mathcal{K}\}$.
S2: if the value of a check node is zero, none of its neighboring variable nodes belongs to the set $\mathcal{K}$, i.e., $\{\mathcal{M}(c_i) \cap \mathcal{K}\} = \emptyset$.
In VB algorithms, as we will see in the next section, variable nodes are verified based on statements similar to S1 and S2. Therefore, the assumption on the continuity of f or g is a sufficient condition for the algorithms to converge to the true original signal. Henceforth, we assume that all the weights of the bipartite graph (and therefore the entries of G) are in {0, 1}, and the distribution f is continuous over $\mathbb{R}$.

IV. Decoding Process and Recovery Algorithms 

The decoder, knowing the measurements and the sensing matrix, tries to recover the original signal. So, neither the density 
factor a nor the support set is known at the decoder. 

In this section we discuss the first algorithm (LM1) used in [11] (here referred to as LM) and the algorithm used in [5] (referred to as SBB). These two algorithms are the original VB algorithms in the context of compressive sensing. With the description given in Section II-B, the algorithm in [3], referred to as XH, falls in the category of VB algorithms as well. In the original XH, only one variable node is verified at each iteration. Here we propose and discuss a parallel version of this algorithm. For the case where $n \to \infty$ and $d_v \geq 5$, by analyzing the set of all variable nodes that can potentially be verified at each iteration of the original XH, it can be shown that the verification of one variable node does not exclude another variable node from this set. Therefore, both versions of XH perform identically in terms of success threshold. The parallel version, however, is considerably faster.

As the last algorithm, we reveal the support set to the decoder and use the conventional peeling decoder in [17]. We refer to this algorithm as Genie. The Genie performance is an upper bound on the performance of VB algorithms.²

The description of these four algorithms follows. Except for the Genie, all variable nodes are initially "unknown". Before the first iteration, all variable nodes that have at least one neighboring check node with value 0 are removed from the graph. In each iteration of the four algorithms, when a variable node is verified, its verified value is subtracted from the values of all its neighboring check nodes. The node is then removed from the graph along with all edges adjacent to it. Check nodes with degree 0 are also removed from the graph. At any iteration, the algorithms stop if either all variable nodes are verified or the algorithm makes no further progress.

At each iteration $\ell$, the algorithms proceed as follows:

LM 

• Find the degree of each check node. Verify all variable nodes that have at least one check node of degree one with the value of the singly-connected check node.
SBB 

1) Sequentially go through all variable nodes. For each variable node v, look for two check nodes $c_i, c_j \in \mathcal{M}(v)$, $j \neq i$, with identical value g.

2) Verify to zero all variable nodes v' adjacent to either $c_i$ or $c_j$ but not both, i.e., $\forall v' : v' \in \{\mathcal{M}(c_i) \cup \mathcal{M}(c_j)\} - \{\mathcal{M}(c_i) \cap \mathcal{M}(c_j)\}$.

3) If v is the only variable node connected to both $c_i$ and $c_j$ ($v = \{\mathcal{M}(c_i) \cap \mathcal{M}(c_j)\}$), verify it with the value g.

For the sake of presentation, we have presented simplified versions of the LM and SBB algorithms. In Section VI, the necessary modifications that should be made in the analysis to deal with the original algorithms are discussed.

² The performance of Genie is the same as the performance of the peeling algorithm over the BEC.

XH 

• Find all variable nodes $v_i$ such that, for each $v_i$, $\lceil d_v/2 \rceil$ or more of its neighboring check nodes ($\mathcal{M}(v_i)$) have the same value $g_i$. For each such variable node, verify $v_i$ with the value $g_i$.

Genie Algorithm

• In the subgraph induced by the unverified variable nodes in the set $\mathcal{K}$, verify all variable nodes that have at least one check node of degree one with the value of the singly-connected check node.
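As an illustration of the pre-processing step and the simplified LM rule described above, a minimal Python sketch follows. It assumes unit edge weights, represents the graph as a list of neighbor lists (one per check node), and treats a measurement as zero only when it is exactly 0.0. The function name `lm_decode` and this data layout are hypothetical choices, not part of the original algorithm descriptions.

```python
def lm_decode(checks, y, n):
    """Simplified LM sketch. Returns a list with None for unverified variables."""
    estimate = [None] * n
    residual = list(y)                          # check values after subtractions
    unknown = [set(nbrs) for nbrs in checks]    # unverified neighbors per check
    var_checks = [[] for _ in range(n)]
    for c, nbrs in enumerate(checks):
        for v in nbrs:
            var_checks[v].append(c)

    def verify(v, value):
        # Assign the value, subtract it from neighboring checks, remove v.
        estimate[v] = value
        for c in var_checks[v]:
            residual[c] -= value
            unknown[c].discard(v)

    # Before the first iteration: neighbors of zero-valued checks are zero.
    for c, value in enumerate(y):
        if value == 0.0:
            for v in list(unknown[c]):
                verify(v, 0.0)

    progress = True
    while progress:
        progress = False
        # Collect all degree-one checks first so that one pass is one iteration.
        singles = [(next(iter(unknown[c])), residual[c])
                   for c in range(len(checks)) if len(unknown[c]) == 1]
        for v, value in singles:
            if estimate[v] is None:
                verify(v, value)
                progress = True
    return estimate


checks = [[0, 1], [1, 2], [2, 3], [0, 3]]       # toy graph with dv = dc = 2
signal = [0.0, 0.0, 1.5, 0.0]
y = [sum(signal[v] for v in nbrs) for nbrs in checks]
recovered = lm_decode(checks, y, n=4)
stuck = lm_decode([[0, 1], [0, 1]], [3.0, 3.0], n=2)   # no degree-one check exists
```

In the toy example the two zero-valued checks verify three variables as zero, after which the remaining variable is singly connected and is verified with the check value. In the second call no check ever has degree one, so the decoder stops without progress.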

V. Asymptotic Analysis Framework 

To describe the analysis framework, we assume a $(d_v, d_c)$ graph. Let $V^* \subset V$ be a subset of variable nodes and $\mathcal{G}^*$ be the left-regular graph induced by the set $V^*$. We denote the set of check nodes with degree $i$, $1 \leq i \leq d_c$, in $\mathcal{G}^*$ by $\mathcal{N}_i$. This partitioning is depicted in Figure 1 for $V^* = \mathcal{K}$. For mathematical convenience we let $\mathcal{N}_0$ be the set of check nodes that have been removed from the induced subgraph. Clearly, $C = \bigcup_{i=0}^{d_c} \mathcal{N}_i$. Further, let $\mathcal{X}_i \subset V^*$, $0 \leq i \leq d_v$, be the set of variable nodes that have i edges connected to the set $\mathcal{N}_1$. Figure 2 shows the partitioning of $V^* = \mathcal{K}$ into the sets $\mathcal{X}_i$.

At this stage, we model the verification process of the Genie, LM, SBB and XH algorithms using the sets $\mathcal{X}_i$ in the asymptotic regime where $n \to \infty$. This verification model is presented and proved in Theorem 2. In this theorem, $\mathcal{K}' = \mathcal{K} \cup \mathcal{K}_\Delta$, where $\mathcal{K}_\Delta$ is the set of zero-valued variable nodes, in which all variable nodes have $d_v$ edges connected to the set $\mathcal{N}_{\neq 0} = \bigcup_{i=1}^{d_c} \mathcal{N}_i$.




Fig. 1: Partitioning check nodes based on their degree in the subgraph induced by $\mathcal{K}$.

Fig. 2: Partitioning variable nodes in $\mathcal{K}$ based on the number of their neighbors in $\mathcal{N}_1$.



Theorem 2. In each iteration, a variable node is verified asymptotically almost surely if and only if it belongs to the set $\bigcup_{i=\beta}^{d_v} \mathcal{X}_i$, where $\beta$ equals 1, 2, and $\lceil d_v/2 \rceil$ for the Genie, SBB and XH, respectively. In these cases $V^* = \mathcal{K}$.

In each iteration of the LM algorithm, a variable node is verified asymptotically almost surely if and only if it belongs to the set $\bigcup_{i=1}^{d_v} \mathcal{X}_i$. In this case $V^* = \mathcal{K}'$.

Proof of Theorem 2: We first prove the theorem for the SBB algorithm. The proof can also be used for the XH algorithm with no major changes. A variable node v is resolved in the SBB algorithm if and only if it is the only unresolved variable node attached to at least two check nodes $c_{i_1}$ and $c_{i_2}$ with the same value. If $c_{i_1}, c_{i_2} \in \mathcal{N}_1$, then by definition $v \in \{\mathcal{X}_2, \mathcal{X}_3, \ldots, \mathcal{X}_{d_v}\}$. To prove the converse, assume for simplicity that at iteration $\ell$, $v \in \mathcal{X}_2$. The only way that this variable node is not resolved in this iteration is that it shares the two singly-connected check nodes with at least one zero-valued variable node v'. However, this means that the two variable nodes v, v' form a cycle of length 4. According to [18], a random regular graph has a fixed number of short cycles regardless of its size. Thus, as the number of variable nodes tends to infinity, the probability that two variable nodes v, v' form a cycle of length 4 goes to zero. In other words, a variable node $v \in \{\mathcal{X}_2, \mathcal{X}_3, \ldots, \mathcal{X}_{d_v}\}$ is resolved in the SBB algorithm asymptotically almost surely. This completes the proof.

In the LM algorithm, after removing variable nodes with at least one check node with value equal to zero, the remaining variable nodes in the graph are in the set $\mathcal{K}'$. This justifies the use of $\mathcal{K}'$. The rest of the proof follows as before. ■

Corollary 1. For a $(d_v, d_c)$ graph, the success threshold is the highest for Genie, followed by SBB and lastly XH. This is because the number of $\mathcal{X}_i$ sets contributing to the verification of variable nodes decreases in the same order.

Corollary 1 is also verified by the simulation results in Section VII. Theorem 2 allows us to model the sensing graph and its evolution in the four algorithms with a graph induced by the support set $\mathcal{K}$ or $\mathcal{K}'$ along with the evolution of the sets $\mathcal{X}_i$ and $\mathcal{N}_i$ in the asymptotic regime. To formulate the evolution of the sets $\mathcal{N}_i$ and $\mathcal{X}_i$, we denote by $P_{\mathcal{N}_i}^{(\ell)}$ ($P_{\mathcal{X}_i}^{(\ell)}$) the set of probabilities that a check node (variable node) belongs to the set $\mathcal{N}_i$ ($\mathcal{X}_i$) at iteration $\ell$. The superscript $(\ell)$ denotes the iteration number $\ell$. Also, we denote by $\alpha^{(\ell)}$ the probability that a variable node belongs to the unverified set $\mathcal{K}^{(\ell)}$. An iteration $\ell \geq 1$ starts by knowing the probabilities $\alpha^{(\ell)}$, $P_{\mathcal{N}_i}^{(\ell)}$, and $P_{\mathcal{X}_i}^{(\ell)}$, continues with the calculation of $\alpha^{(\ell+1)}$, and ends with the calculation of $P_{\mathcal{N}_i}^{(\ell+1)}$ and $P_{\mathcal{X}_i}^{(\ell+1)}$.



Using this analysis approach, we are able to track the evolution of $\alpha^{(\ell)}$ with the iterations for a given initial density factor $\alpha^{(0)}$. The analysis proceeds until either the probability $\alpha^{(\ell)}$ decreases monotonically to zero as the number of iterations increases, or it is bounded away from zero for any number of iterations. In the first case the algorithm succeeds in recovering the original signal entirely, while in the second case it fails. By examining different values of $\alpha^{(0)}$, the success threshold, defined as the supremum value of $\alpha^{(0)}$ for which the signal can be fully recovered as $n \to \infty$ and $\ell \to \infty$, can be determined for different $(d_v, d_c)$ pairs.

In what follows, we present the procedure for finding the update rules for the different probabilities. The formulas are calculated using combinatorial enumeration and probabilistic arguments. The proofs can be found in Appendix A.

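1) Based on the set of probabilities $P_{\mathcal{X}_i}^{(\ell)}$, find the probability $P_R^{(\ell)}$ that a variable node is verified in iteration $\ell + 1$ from $P_R^{(\ell)} = \sum_{i=\beta}^{d_v} P_{\mathcal{X}_i}^{(\ell)}$. The value of $\beta$ can be 1, 2, or $\lceil d_v/2 \rceil$, as in Theorem 2. The probability $\alpha^{(\ell+1)}$ then follows from

$$\alpha^{(\ell+1)} = \alpha^{(\ell)} \left(1 - P_R^{(\ell)}\right).$$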

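2) Find the set of probabilities $P_{\mathcal{N}_j}^{(\ell+1)}$, $j = 0, \ldots, d_c$, from $P_{\mathcal{N}_j}^{(\ell+1)} = \sum_{i=j}^{d_c} P_{\mathcal{N}_i}^{(\ell)} P_{\mathcal{N}_{ij}}^{(\ell)}$, where:

[…]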

3) Find the set of probabilities $P_{\mathcal{X}_i}^{(\ell+1)}$, $i = 0, \ldots, d_v$, from […], where:

[…]

The quantities A and B are given by:

[…]

The initial probabilities $P_{\mathcal{N}_i}^{(0)}$ and $P_{\mathcal{X}_i}^{(0)}$ for Genie, SBB and XH are as follows. These probabilities for LM are more involved and can be found in Appendix A.

$$P_{\mathcal{N}_i}^{(0)} = \binom{d_c}{i} \left(\alpha^{(0)}\right)^{i} \left(1 - \alpha^{(0)}\right)^{d_c - i}, \qquad P_{\mathcal{X}_i}^{(0)} = \binom{d_v}{i} \left(p^{(0)}\right)^{i} \left(1 - p^{(0)}\right)^{d_v - i},$$

where $p^{(0)} = P_{\mathcal{N}_1}^{(0)} / (\alpha^{(0)} d_c)$ and $\alpha^{(0)} = \alpha$ (the density factor). The number of update rules in each iteration is almost the same as that in [13]. Therefore, the overall computational complexity of the analysis is linear in $\ell$. However, the updates are based on simple calculations as opposed to solving differential equations as in [13].
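To make the initialization and the first update step concrete, here is a small Python sketch. It assumes that the initial distributions for Genie, SBB and XH are binomial, i.e., a check node has Binomial($d_c$, $\alpha^{(0)}$) neighbors in the support set and a support variable node has Binomial($d_v$, $p^{(0)}$) edges into $\mathcal{N}_1$, with $p^{(0)} = P_{\mathcal{N}_1}^{(0)} / (\alpha^{(0)} d_c)$ as stated in the text. This binomial form is an assumption drawn from the surrounding definitions, and the function names are hypothetical. The updates of $P_{\mathcal{N}_i}$ and $P_{\mathcal{X}_i}$ (steps 2 and 3) are not covered.

```python
from math import ceil, comb


def initial_distributions(alpha, dv, dc):
    """Assumed binomial initial distributions for Genie, SBB and XH."""
    p_n = [comb(dc, i) * alpha**i * (1 - alpha)**(dc - i) for i in range(dc + 1)]
    p = p_n[1] / (alpha * dc)           # p^(0) = P_N1^(0) / (alpha^(0) * d_c)
    p_x = [comb(dv, i) * p**i * (1 - p)**(dv - i) for i in range(dv + 1)]
    return p_n, p, p_x


def next_alpha(alpha, p_x, beta):
    """Step 1: P_R = sum_{i >= beta} P_Xi, then alpha' = alpha * (1 - P_R)."""
    p_r = sum(p_x[beta:])
    return alpha * (1 - p_r)


dv, dc, alpha0 = 5, 6, 0.3
p_n, p0, p_x = initial_distributions(alpha0, dv, dc)
betas = {"Genie": 1, "SBB": 2, "XH": ceil(dv / 2)}
alpha1 = {name: next_alpha(alpha0, p_x, beta) for name, beta in betas.items()}
```

Under this assumption $p^{(0)}$ simplifies to $(1 - \alpha^{(0)})^{d_c - 1}$, the probability that the other $d_c - 1$ neighbors of a check node are all zero. After one step the remaining density is smallest for Genie and largest for XH, in line with Corollary 1.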

VI. Generalization of the Framework 

The extra step in the original LM and SBB algorithms is as follows: at each iteration, look for check nodes with value equal 
to zero. If such a check node is found, verify the neighboring variable nodes with the value zero. 

To analyze the original algorithms, the set of variable nodes is divided into two sets $\mathcal{K}$ and $\mathcal{K}'$ as in the simplified LM. The set of check nodes is then categorized into sets $\mathcal{N}_{ij}$, where i and j indicate the number of edges between the check node and the sets $\mathcal{K}$ and $\mathcal{K}'$, respectively. The recursive formulas for the new setup can be found using the same methodology as before.

VII. Simulation Results 

In this section, we present simulation results obtained by running the recovery algorithms over random regular bipartite graphs to recover sparse signals of finite length n. We also present analytical results obtained by running the mathematical analysis described in Section V for the asymptotic regime of $n \to \infty$. The comparison of the results shows that there is good agreement between simulation and analytical results for moderately large $n = 10^5$.

In all simulations, signal elements (variables) are drawn from a Gaussian distribution. The regular bipartite graphs are 
constructed randomly with no parallel edges and all the edge weights equal to one. Each simulation point is generated by 
averaging over 1000 random instances of the input signal. 



For the analytical results, we consider the recovery algorithm successful if a^^^ < 10^^. To calculate the success threshold, 
a binary search is performed until the separation between start and end of the search region is less than 10~^. 
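The binary search described above can be sketched as follows. Because the update rules of the framework are not reproduced in this sketch, it uses the standard density-evolution recursion of the peeling decoder over the binary erasure channel, $x_{\ell+1} = \alpha \left(1 - (1 - x_\ell)^{d_c - 1}\right)^{d_v - 1}$, as a stand-in success test for the Genie only, relying on the remark that the Genie performs like the peeling algorithm over the BEC. The function names, the tolerances, and the iteration cap are illustrative assumptions and not the values used in the paper.

```python
def genie_succeeds(alpha, dv, dc, tol=1e-12, max_iter=20000):
    """Stand-in success test: BEC peeling recursion started at x = alpha."""
    x = alpha
    for _ in range(max_iter):
        x = alpha * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
        if x < tol:
            return True
    return False


def success_threshold(succeeds, lo=0.0, hi=1.0, width=1e-4):
    """Binary search between a succeeding lo and a failing hi.

    Assumes success is monotone in the density factor.
    """
    while hi - lo > width:
        mid = 0.5 * (lo + hi)
        if succeeds(mid):
            lo = mid
        else:
            hi = mid
    return lo


genie_34 = success_threshold(lambda a: genie_succeeds(a, 3, 4))
genie_56 = success_threshold(lambda a: genie_succeeds(a, 5, 6))
```

Under these assumptions the two results land close to the Genie entries of Table I for (3, 4) and (5, 6) graphs. The same `success_threshold` routine could be reused with any other success test, for instance one built on the update rules of Section V.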

As the first experiment, we apply the XH, SBB and LM algorithms to four randomly constructed (5, 6) regular graphs with $n = \{3, 15, 100, 1000\} \times 10^3$. The success rate of the algorithms vs. the initial density factor $\alpha^{(0)}$ is shown in Figure 3. From the figure, we can see that, for all algorithms, as n increases, the transition part of the curves becomes sharper, such that the curves for $n = 10^6$ practically look like a step function. In the figure we have also marked, with arrows, the success thresholds of the algorithms for (5, 6) graphs obtained from the proposed analysis. As can be seen, the thresholds match the waterfall region of the simulation curves very well.

In Table I, we have listed the analytical success thresholds of the iterative recovery algorithms for graphs with different $d_v$ and $d_c$. The results for the XH and SBB algorithms on (3, 4) graphs are missing as these algorithms perform poorly on such graphs. As expected, for every graph, the Genie algorithm has the best performance. This is followed by the SBB, LM and XH algorithms, respectively. Careful inspection of the results in Table I indicates that the oversampling ratio $r_o$ improves consistently when both $d_v$ and $d_c$ are decreased. In fact, among the results presented in Table I, the application of the Genie and LM to (3, 4) graphs results in the lowest oversampling ratios ($r_o = d_v / (\alpha d_c)$) of $\approx 1.16$ and $\approx 2.51$, respectively. Note that the success threshold of the Genie over regular graphs is far from the optimal achievable success threshold $d_v/d_c$ proved in [4].
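The oversampling ratios quoted above follow directly from the thresholds: with $m = n d_v / d_c$ measurements and $k = \alpha n$ non-zero elements, $r_o = m/k = d_v / (\alpha d_c)$. A short Python check, using the (3, 4) thresholds from Table I (the function and variable names are illustrative):

```python
def oversampling_ratio(alpha, dv, dc):
    """r_o = m / k = d_v / (alpha * d_c), using m = n*d_v/d_c and k = alpha*n."""
    return dv / (alpha * dc)


# Analytical success thresholds for (3, 4) graphs, copied from Table I.
thresholds_34 = {"Genie": 0.6474, "LM": 0.2993}
ratios_34 = {name: oversampling_ratio(a, 3, 4)
             for name, a in thresholds_34.items()}
```

Rounded to two decimals, the two ratios agree with the values 1.16 and 2.51 stated in the text.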

To further investigate the degree of agreement between our asymptotic theoretical analysis and finite-length simulation results, we have presented in Figure 4 the evolution of the density factor $\alpha^{(\ell)}$ with iterations $\ell$ for the four algorithms over a (5, 6) graph with $n = 10^5$. For each algorithm, two values of $\alpha^{(0)}$ are selected: one above and one below the success threshold presented in Table I. The theoretical results are shown by solid lines while simulations are presented with dotted lines. As one can see, the two sets of results are in close agreement, particularly for the cases where $\alpha^{(0)}$ is above the threshold and for smaller values

Fig. 4: Evolution of $\alpha^{(\ell)}$ vs. iteration number $\ell$ for the four recovery algorithms over a (5, 6) graph with n = 100K.



TABLE I: Success Thresholds for different graphs and algorithms

(d_v, d_c)   (3,4)    (5,6)    (5,7)    (5,8)    (7,8)
XH           -        0.1846   0.1552   0.1339   0.1435
SBB          -        0.3271   0.2783   0.2421   0.3057
LM           0.2993   0.2541   0.2011   0.1646   0.2127
Genie        0.6474   0.5509   0.4786   0.4224   0.4708



References 

[1] D. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, April 2006.
[2] E. Candes, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489-509, February 2006.
[3] W. Xu and B. Hassibi, "Efficient compressive sensing with deterministic guarantees using expander graphs," Information Theory Workshop (ITW), pp. 414-419, September 2007.
[4] Y. Wu and S. Verdú, "Fundamental limits of almost lossless analog compression," International Symposium on Information Theory (ISIT), pp. 359-363, 2009.
[5] S. Sarvotham, D. Baron, and R. Baraniuk, "Sudocodes - fast measurement and reconstruction of sparse signals," IEEE International Symposium on Information Theory, pp. 2804-2808, July 2006.
[6] F. Zhang and H. Pfister, "Compressed sensing and linear codes over real numbers," Proceedings Workshop on Information Theory and Application, pp. 558-561, February 2008.
[7] S. Hoory, N. Linial, and A. Wigderson, "Expander graphs and their application," Bulletin of the American Mathematical Society, August 2006.
[8] D. L. Donoho, A. Maleki, and A. Montanari, "Message passing algorithms for compressed sensing: I. motivation and construction," arXiv, 2009. [Online]. Available: http://arxiv.org/pdf/0911.4219
[9] D. L. Donoho, A. Maleki, and A. Montanari, "Message passing algorithms for compressed sensing: II. analysis and validation," arXiv, 2009. [Online]. Available: http://arxiv.org/pdf/0911.4222
[10] M. Luby and M. Mitzenmacher, "Verification-based decoding for packet-based low-density parity-check codes," IEEE Transactions on Information Theory, vol. 51, no. 1, pp. 120-127, January 2005.
[11] F. Zhang and H. D. Pfister, "Verification decoding of high-rate LDPC codes with applications in compressed sensing." [Online]. Available: http://arxiv.org/pdf/0903.2232
[12] F. Zhang and H. D. Pfister, "List-message passing achieves capacity on the q-ary symmetric channel for large q," IEEE Global Telecommunications Conference (GLOBECOM), pp. 283-287, November 2007.
[13] F. Zhang and H. D. Pfister, "List-message passing achieves capacity on the q-ary symmetric channel for large q." [Online]. Available: http://arxiv.org/pdf/0806.3243
[14] Y. Lu, A. Montanari, B. Prabhakar, S. Dharmapurikar, and A. Kabbani, "Counter braids: A novel counter architecture for per-flow measurement," SIGMETRICS, June 2008.
[15] M. Akcakaya, J. Park, and V. Tarokh, "Compressive sensing using low density frames," arXiv, March 2009. [Online]. Available: http://arxiv.org/abs/0903.0650
[16] T. J. Richardson and R. L. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Transactions on Information Theory, vol. 47, pp. 599-618, 2001.
[17] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman, "Efficient erasure correcting codes," IEEE Transactions on Information Theory, vol. 47, pp. 569-584, 2001.
[18] B. D. McKay, N. C. Wormald, and B. Wysocka, "Short cycles in random regular graphs," The Electronic Journal of Combinatorics, vol. 11, September 2004.



Appendix A 

Detailed Description of the Analysis Framework 

B.l. General Setup 

To derive the formulation for the general framework, we assume that we are at iteration $\ell$. The state of the system at this iteration is fully characterized by the set $\mathcal{K}^{(\ell)}$ and the probabilities $P_{\mathcal{N}_i}^{(\ell)}$, $P_{\mathcal{X}_i}^{(\ell)}$, and $\alpha^{(\ell)}$.

The probabilities $P_{\mathcal{N}_i}^{(\ell)}$ denote the probability of a check node having i edges connected to the set $\mathcal{K}^{(\ell)}$.

The probabilities $P_{\mathcal{X}_i}^{(\ell)}$ denote the probability of a variable node in the set $\mathcal{K}^{(\ell)}$ having i edges connected to the set $\mathcal{N}_1^{(\ell)}$.

The probability $\alpha^{(\ell)}$ denotes the probability of a variable node belonging to the set $\mathcal{K}^{(\ell)}$.

Throughout the analysis, the head and tail of an edge e will be denoted by $h_e$ and $t_e$, respectively. As the direction of edges is of no consequence to our analysis, without loss of generality, we assign the head to the variable side and the tail to the check side.



B.l. Derivation of Formulas 

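To find the probability that a variable node is resolved, we first need to characterize the set of variable nodes resolved by each algorithm. From a careful inspection of the iterative algorithms under consideration, and based on Theorem 2 in Section V, in general the variable nodes in the set $\mathcal{R}_{\mathcal{X}} = \{\mathcal{X}_\beta \cup \mathcal{X}_{\beta+1} \cup \cdots \cup \mathcal{X}_{d_v}\}$ are recovered and those in the set $\bar{\mathcal{R}}_{\mathcal{X}} = \{\mathcal{X}_0 \cup \mathcal{X}_1 \cup \cdots \cup \mathcal{X}_{\beta-1}\}$ are left intact, where the value of $\beta$ depends on the algorithm.
Thus, the probability $P_R^{(\ell)}$ of a variable node in $\mathcal{K}^{(\ell)}$ being recovered is:

$$P_R^{(\ell)} = \sum_{i=\beta}^{d_v} P_{\mathcal{X}_i}^{(\ell)}.$$

Therefore, according to the total probability theorem, the probability of a variable node v remaining unresolved, i.e., $v \in \mathcal{K}^{(\ell+1)}$, is:

$$\alpha^{(\ell+1)} = \alpha^{(\ell)} \left(1 - P_R^{(\ell)}\right).$$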



When a variable node is recovered, its edges along with the variable node itself are removed from the subgraph induced by $\mathcal{K}^{(\ell)}$ and therefore, check nodes incident to these removed edges would face a reduction in their degree. We denote by $p_{\mathcal{N}_{ij}}^{(\ell)}$ the probability that a check node $c$ turns from degree $i$ in iteration $\ell$ (i.e., $c \in \mathcal{N}_i^{(\ell)}$) to degree $j \le i$ in iteration $\ell+1$ (i.e., $c \in \mathcal{N}_j^{(\ell+1)}$). This happens if out of $i$ edges emanating from $c$ and incident to the set of unresolved variable nodes $\mathcal{K}^{(\ell)}$, $i - j$ of them are removed.

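On the other side of the graph, when a variable node $v \in \mathcal{X}_i^{(\ell)}$ is recovered (i.e., $\mathcal{X}_i \subset \mathcal{R}_{\mathcal{X}}$), by definition, out of $d_v$ edges emanating from $v$, $i$ are connected to the set $\mathcal{N}_1$ and $d_v - i$ are connected to the set $\mathcal{R}_{\mathcal{N}} = \{\mathcal{N}_2 \cup \mathcal{N}_3 \cup \cdots \cup \mathcal{N}_{d_c}\}$.
In the asymptotic case, as $n$ grows large, we assume that the graph has a random structure in every iteration. Therefore, for each recovered variable node $v$, the sets of $i$ and $d_v - i$ removed edges are distributed uniformly with respect to the check nodes in $\mathcal{N}_1$ and $\mathcal{R}_{\mathcal{N}}$, respectively. As we have two sets $\mathcal{N}_1$ and $\mathcal{R}_{\mathcal{N}}$ to deal with, we differentiate between $p_{\mathcal{N}_{1j}}^{(\ell)}$ and $p_{\mathcal{N}_{ij}}^{(\ell)}$ ($i > 1$). Once the probabilities $p_{\mathcal{N}_{1j}}^{(\ell)}$ and $p_{\mathcal{N}_{ij}}^{(\ell)}$ are found, the new check node degree distribution $p_{\mathcal{N}_j}^{(\ell+1)}$ with respect to the subgraph induced by $\mathcal{K}^{(\ell+1)}$ can then be derived using the total probability law:

$$p_{\mathcal{N}_j}^{(\ell+1)} = \sum_{i=j}^{d_c} p_{\mathcal{N}_i}^{(\ell)} \, p_{\mathcal{N}_{ij}}^{(\ell)}, \qquad j = 0, \cdots, d_c.$$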
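The total probability step can be sketched numerically as follows. This sketch assumes that each edge of a check node is removed independently, which makes the transition probabilities binomial. That assumption, the function names, and the two removal probabilities `p_d1` (for check nodes of degree 1) and `p_dgt1` (for check nodes of degree larger than 1) are illustrative choices rather than the paper's exact formulation.

```python
from math import comb


def thinning_row(i, p_remove):
    """Transition probabilities from degree i to degrees 0..i, assuming each
    of the i edges is removed independently with probability p_remove."""
    return [comb(i, j) * p_remove ** (i - j) * (1 - p_remove) ** j
            for j in range(i + 1)]


def check_degree_update(p_n, p_d1, p_dgt1):
    """p_n[i] is the probability that a check node has degree i at iteration l.
    Returns the degree distribution at iteration l + 1."""
    d_c = len(p_n) - 1
    new = [0.0] * (d_c + 1)
    for i, p_i in enumerate(p_n):
        # Degree-1 check nodes and higher-degree check nodes see different
        # edge-removal probabilities.
        p_remove = p_d1 if i == 1 else p_dgt1
        for j, t in enumerate(thinning_row(i, p_remove)):
            new[j] += p_i * t          # total probability law
    return new


# Hypothetical numbers with d_c = 2.
new_p_n = check_degree_update([0.2, 0.3, 0.5], p_d1=1.0, p_dgt1=0.5)
```

Each row of transition probabilities sums to one, so the updated vector remains a probability distribution.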

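To find the probabilities $p_{\mathcal{N}_{1j}}^{(\ell)}$ and $p_{\mathcal{N}_{ij}}^{(\ell)}$, $i > 1$, we denote by $p_{d=1}^{(\ell)}$ and $p_{d>1}^{(\ell)}$ the conditional probabilities that an edge in the induced subgraph is removed given that it is incident to a check node in the set $\mathcal{N}_1$ and $\mathcal{R}_{\mathcal{N}}$, respectively. It then follows that:

$$p_{\mathcal{N}_{10}}^{(\ell)} = p_{d=1}^{(\ell)}, \qquad p_{\mathcal{N}_{11}}^{(\ell)} = 1 - p_{d=1}^{(\ell)},$$

and

$$p_{\mathcal{N}_{ij}}^{(\ell)} = \binom{i}{j} \left(p_{d>1}^{(\ell)}\right)^{i-j} \left(1 - p_{d>1}^{(\ell)}\right)^{j}, \qquad i = 2, \cdots, d_c, \quad j = 0, \cdots, i.$$
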

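The probability $p_{d=1}^{(\ell)}$ can then be calculated as follows:

$$p_{d=1}^{(\ell)} = \sum_{i=\beta}^{d_v} \Pr[h_e \in \mathcal{X}_i^{(\ell)} \mid t_e \in \mathcal{N}_1^{(\ell)}, h_e \in \mathcal{K}^{(\ell)}] = \sum_{i=\beta}^{d_v} \frac{\Pr[t_e \in \mathcal{N}_1^{(\ell)} \mid h_e \in \mathcal{X}_i^{(\ell)}, h_e \in \mathcal{K}^{(\ell)}] \, \Pr[h_e \in \mathcal{X}_i^{(\ell)} \mid h_e \in \mathcal{K}^{(\ell)}]}{\Pr[t_e \in \mathcal{N}_1^{(\ell)} \mid h_e \in \mathcal{K}^{(\ell)}]} = \frac{1}{d_v \, p^{(\ell)}} \sum_{i=\beta}^{d_v} i \, p_{\mathcal{X}_i}^{(\ell)},$$

where $p^{(\ell)}$ is the probability of an edge $e$ being adjacent to a check node in $\mathcal{N}_1^{(\ell)}$ conditioned on the fact that it is adjacent to a variable node in $\mathcal{K}^{(\ell)}$ (refer to Figure 2). By using Bayes' rule, this probability is calculated as:

$$p^{(\ell)} = \Pr[t_e \in \mathcal{N}_1^{(\ell)} \mid h_e \in \mathcal{K}^{(\ell)}] = \frac{\Pr[h_e \in \mathcal{K}^{(\ell)} \mid t_e \in \mathcal{N}_1^{(\ell)}] \, \Pr[t_e \in \mathcal{N}_1^{(\ell)}]}{\Pr[h_e \in \mathcal{K}^{(\ell)}]} = \frac{p_{\mathcal{N}_1}^{(\ell)}}{\alpha^{(\ell)} d_c}. \qquad (2)$$

The probability $p_{d>1}^{(\ell)}$ can be computed following similar steps:

$$p_{d>1}^{(\ell)} = \sum_{i=\beta}^{d_v} \frac{\left(1 - \Pr[t_e \in \mathcal{N}_1^{(\ell)} \mid h_e \in \mathcal{X}_i^{(\ell)}, h_e \in \mathcal{K}^{(\ell)}]\right) \Pr[h_e \in \mathcal{X}_i^{(\ell)} \mid h_e \in \mathcal{K}^{(\ell)}]}{1 - \Pr[t_e \in \mathcal{N}_1^{(\ell)} \mid h_e \in \mathcal{K}^{(\ell)}]} = \frac{1}{1 - p^{(\ell)}} \sum_{i=\beta}^{d_v} \left(1 - \frac{i}{d_v}\right) p_{\mathcal{X}_i}^{(\ell)}.$$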



Given $p_{\mathcal{N}_j}^{(\ell+1)}$, the updated set $\mathcal{K}^{(\ell+1)}$ should be re-partitioned into the sets $\mathcal{X}_i^{(\ell+1)}$. By definition, a variable node $v$ in $\mathcal{X}_i$ has $i$ connections to $\mathcal{N}_1$ and $d_v - i$ connections to the set $\mathcal{R}_{\mathcal{N}}$. Therefore, if one of the adjacent check nodes of $v$ in $\mathcal{R}_{\mathcal{N}}^{(\ell)}$ turns to a check node in $\mathcal{N}_1^{(\ell+1)}$, $v$ will move from $\mathcal{X}_i^{(\ell)}$ to $\mathcal{X}_{i+1}^{(\ell+1)}$. This is shown in Figure 5.

We denote by $\mathcal{N}_{\Delta}^{(\ell+1)}$ the set of check nodes that move from $\mathcal{R}_{\mathcal{N}}^{(\ell)}$ to $\mathcal{N}_1^{(\ell+1)}$. The configuration of $\mathcal{N}_1^{(\ell+1)}$, $\mathcal{N}_1^{(\ell)}$ and $\mathcal{N}_{\Delta}^{(\ell+1)}$ is depicted in Figure 6.

Fig. 5: A variable node in $\mathcal{X}_i$ turns to a variable node in $\mathcal{X}_{i+1}$ when all but one of the edges connected to its adjacent check node are removed from the induced subgraph.

Fig. 6: Configuration of $\mathcal{N}_1$ after recovery.

We also refer to the set of edges that have their tail in the set $\mathcal{R}_{\mathcal{N}}^{(\ell)}$ as free edges. Due to the random structure of the graph assumed in the asymptotic case, edges connected to the set $\mathcal{N}_{\Delta}^{(\ell+1)}$ are uniformly distributed with respect to free edges.



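It can be seen that the probability $p_{\mathcal{X}_{ij}}^{(\ell)}$, defined as the probability of a variable node $v \in \mathcal{X}_i^{(\ell)}$ turning to $v \in \mathcal{X}_j^{(\ell+1)}$, is calculated by:

$$p_{\mathcal{X}_{ij}}^{(\ell)} = \Pr[v \in \mathcal{X}_j^{(\ell+1)} \mid v \in \mathcal{X}_i^{(\ell)}, v \in \mathcal{K}^{(\ell+1)}] = \binom{d_v - i}{j - i} \left(p_x^{(\ell+1)}\right)^{j-i} \left(1 - p_x^{(\ell+1)}\right)^{d_v - j}, \qquad j = i, \cdots, d_v, \quad i = 0, \cdots, \beta - 1, \qquad (3)$$

where $\beta$ is the algorithm-dependent parameter defined in conjunction with the set $\mathcal{R}_{\mathcal{X}}$, and $p_x^{(\ell+1)}$ is defined as the probability that a free edge corresponds to the set $\mathcal{N}_{\Delta}^{(\ell+1)}$. Note that such an edge will have its head in the complement of $\mathcal{R}_{\mathcal{X}}$ in $\mathcal{K}^{(\ell)}$, because the set $\mathcal{R}_{\mathcal{X}}$ is completely resolved.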



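Thus, based on the total probability law we have:

$$p_{\mathcal{X}_j}^{(\ell+1)} = \frac{\sum_{i=0}^{\min\{j, \beta-1\}} p_{\mathcal{X}_i}^{(\ell)} \, p_{\mathcal{X}_{ij}}^{(\ell)}}{1 - p_{\mathcal{R}}^{(\ell)}}, \qquad j = 0, \cdots, d_v.$$

The denominator is just a normalization factor making $p_{\mathcal{X}_j}^{(\ell+1)}$ a valid probability measure. It is derived as:

$$\sum_{j=0}^{d_v} \sum_{i=0}^{\min\{j, \beta-1\}} p_{\mathcal{X}_i}^{(\ell)} \, p_{\mathcal{X}_{ij}}^{(\ell)} = \sum_{i=0}^{\beta-1} p_{\mathcal{X}_i}^{(\ell)} = 1 - p_{\mathcal{R}}^{(\ell)}.$$

The probability $p_x^{(\ell+1)}$ is calculated as follows:

$$p_x^{(\ell+1)} = \frac{p_{\mathcal{N}_{\Delta}}^{(\ell+1)}}{p_{\mathcal{N}_{\Delta}}^{(\ell+1)} + \sum_{i=2}^{d_c} i \, p_{\mathcal{N}_i}^{(\ell+1)}}, \qquad (5)$$

where

$$p_{\mathcal{N}_{\Delta}}^{(\ell+1)} = \sum_{i=2}^{d_c} p_{\mathcal{N}_i}^{(\ell)} \, p_{\mathcal{N}_{i1}}^{(\ell)}$$

is the probability that a check node belongs to $\mathcal{N}_{\Delta}^{(\ell+1)}$. The probability $p_x^{(\ell+1)}$ will be inserted in (3) for the calculation of $p_{\mathcal{X}_{ij}}^{(\ell)}$.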
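The re-partitioning of the unresolved variable nodes can be sketched as below. The sketch assumes that each of the $d_v - i$ edges of a node in $\mathcal{X}_i$ that lead to check nodes of degree larger than 1 independently ends up at a degree-1 check node with a given probability. The name `p_free` for that probability and the function name are hypothetical, and the binomial form is an assumption of this illustration.

```python
from math import comb


def repartition(p_x, beta, p_free):
    """p_x[i] is the probability that an unresolved variable node is in X_i
    at iteration l (i = 0, ..., d_v). Nodes in X_beta, ..., X_dv are recovered
    and drop out. Returns the normalized distribution at iteration l + 1."""
    d_v = len(p_x) - 1
    new = [0.0] * (d_v + 1)
    for i in range(beta):                      # surviving classes only
        for j in range(i, d_v + 1):
            move = (comb(d_v - i, j - i)
                    * p_free ** (j - i)
                    * (1 - p_free) ** (d_v - j))
            new[j] += p_x[i] * move            # total probability law
    norm = sum(p_x[:beta])                     # equals 1 - p_R; must be > 0
    return [x / norm for x in new]


# Hypothetical numbers with d_v = 2 and beta = 1.
next_p_x = repartition([0.5, 0.3, 0.2], beta=1, p_free=0.5)
```

The division by `norm` plays the role of the normalization factor discussed above; the sketch does not guard against the degenerate case in which every node is recovered.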

B.3. Pre-phase Iteration for Genie, XH and SBB Algorithms 

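The initial density factor is denoted by $\alpha^{(0)}$. In a random graph, an edge emanates from a variable node in the set $\mathcal{K}^{(0)}$ with probability $\alpha^{(0)}$. Therefore, $p_{\mathcal{N}_i}^{(0)}$, the probability of a check node being in the set $\mathcal{N}_i^{(0)}$, is given by the following binomial distribution:

$$p_{\mathcal{N}_i}^{(0)} = \binom{d_c}{i} \left(\alpha^{(0)}\right)^{i} \left(1 - \alpha^{(0)}\right)^{d_c - i}, \qquad i = 0, \cdots, d_c. \qquad (6)$$

To find the probability $p_{\mathcal{X}_i}^{(0)}$, we need the probability $p^{(0)}$ defined and calculated in equation (2). Knowing the probability $p^{(0)}$, $p_{\mathcal{X}_i}^{(0)}$ will follow a binomial distribution as follows:

$$p_{\mathcal{X}_i}^{(0)} = \binom{d_v}{i} \left(p^{(0)}\right)^{i} \left(1 - p^{(0)}\right)^{d_v - i}, \qquad i = 0, \cdots, d_v.$$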
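A minimal sketch of this pre-phase is given below. It assumes a regular graph with variable degree `d_v` and check degree `d_c`, and it assumes that the Bayes' rule step evaluates to $p^{(0)} = p_{\mathcal{N}_1}^{(0)} / (\alpha^{(0)} d_c)$. Both the function names and that closed form are assumptions of this illustration.

```python
from math import comb


def binomial_pmf(n, q):
    """Probabilities of 0..n successes in n independent trials."""
    return [comb(n, i) * q ** i * (1 - q) ** (n - i) for i in range(n + 1)]


def prephase(alpha0, d_v, d_c):
    """Initial check-side and variable-side distributions.

    Returns (p_n, p, p_x) where
      p_n[i] : probability that a check node has i edges into the support set
      p      : probability that an edge leaving the support set ends at a
               degree-1 check node (assumed Bayes' rule form, see lead-in)
      p_x[i] : probability that a support node has i degree-1 neighbours
    """
    p_n = binomial_pmf(d_c, alpha0)
    p = p_n[1] / (alpha0 * d_c)
    p_x = binomial_pmf(d_v, p)
    return p_n, p, p_x


# Hypothetical parameters.
p_n, p, p_x = prephase(alpha0=0.5, d_v=2, d_c=3)
```

Under these assumptions `p` simplifies to $(1 - \alpha^{(0)})^{d_c - 1}$, which gives a quick sanity check on the numbers.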



B.4. Pre-phase Iterations for LM Algorithm 

In this section we drop all the superscripts representing the iteration number for ease of notation. They will be reintroduced when there is a potential ambiguity.

Starting from the initial density factor $\alpha^{(0)}$, the probability $p_{\mathcal{N}_i}^{(0)}$ of the check degree distribution in the subgraph induced by the set $\mathcal{K}^{(0)}$ can be calculated from (6).

In the first iteration of the LM algorithm, the variable nodes adjacent to at least one zero-valued check node are set to zero. The set of remaining variable nodes is called the potential support set and is denoted by $\mathcal{K}'$. This set is the union of the real support set $\mathcal{K}$ and an additional set $\mathcal{K}_{\Delta}$: the set of all zero-valued variable nodes that have $d_v$ connections to the nonzero check nodes $\mathcal{N}_{\neq 0}$. The probability $p_{\mathcal{K}_{\Delta}}$ that a variable node belongs to the set $\mathcal{K}_{\Delta}$ is calculated as:

$$p_{\mathcal{K}_{\Delta}} = p_{\Delta}^{d_v},$$

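where $p_{\Delta}$ is the probability that an edge from a zero-valued variable node terminates in $\mathcal{N}_{\neq 0}$. This probability is:

$$p_{\Delta} = \Pr[t_e \in \mathcal{N}_{\neq 0} \mid h_e \notin \mathcal{K}] = 1 - \Pr[t_e \in \mathcal{N}_0 \mid h_e \notin \mathcal{K}] = 1 - \frac{\Pr[h_e \notin \mathcal{K} \mid t_e \in \mathcal{N}_0] \, \Pr[t_e \in \mathcal{N}_0]}{\Pr[h_e \notin \mathcal{K}]} = 1 - \frac{1 \times (1 - \alpha)^{d_c}}{1 - \alpha} = 1 - (1 - \alpha)^{d_c - 1}.$$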
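As a quick numerical illustration of these two probabilities, the following sketch evaluates them for given parameters. The function name `lm_false_support` is hypothetical.

```python
def lm_false_support(alpha, d_v, d_c):
    """Return (p_delta, p_k_delta) for the first LM iteration.

    p_delta   : probability that an edge from a zero-valued variable node
                ends at a nonzero check node
    p_k_delta : probability that all d_v edges of a zero-valued variable node
                end at nonzero check nodes, so the node stays in the
                potential support set
    """
    p_delta = 1.0 - (1.0 - alpha) ** (d_c - 1)
    p_k_delta = p_delta ** d_v
    return p_delta, p_k_delta


# Hypothetical parameters.
p_delta, p_k_delta = lm_false_support(alpha=0.5, d_v=2, d_c=3)
```

For small densities `p_delta` is small, so few zero-valued nodes survive the first iteration; the effect grows quickly with `d_c`.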



In this algorithm we have to group the check nodes based on the number of connections they have to the potential support set, rather than the original support set. This brings sets, denoted by $\mathcal{N}'_i$, into play that reflect the effect of $\mathcal{K}_{\Delta}$ on the degree distribution of check nodes. As $\mathcal{K}_{\Delta}$ does not change the size of $\mathcal{N}_{\neq 0}$, to calculate the probability of each subset $\mathcal{N}'_i$, we calculate the probability of each set $\mathcal{N}_i$ as before and then account for the effect of $\mathcal{K}_{\Delta}$ on changing the degrees. This process can be seen in Figure 7.

Fig. 7: Transition from $\mathcal{N}$ to $\mathcal{N}'$.

With an abuse of notation, we will denote by $p_{\mathcal{N}_{ij}}$ the probability that a check node is transferred from $\mathcal{N}_i$ to $\mathcal{N}'_j$. This probability can be calculated as follows:

$$p_{\mathcal{N}_{ij}} = \binom{d_c - i}{j - i} \left(p'\right)^{j-i} \left(1 - p'\right)^{d_c - j}, \qquad i = 1, \cdots, d_c, \quad j = i, \cdots, d_c,$$

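where $p'$ is the probability that a free edge from $\mathcal{N}_{\neq 0}$ goes to $\mathcal{K}_{\Delta}$ and is calculated as:

$$p' = \Pr[h_e \in \mathcal{K}_{\Delta} \mid t_e \in \mathcal{N}_{\neq 0}, h_e \notin \mathcal{K}] = \frac{\Pr[t_e \in \mathcal{N}_{\neq 0} \mid h_e \in \mathcal{K}_{\Delta}] \, \Pr[h_e \in \mathcal{K}_{\Delta} \mid h_e \notin \mathcal{K}]}{\Pr[t_e \in \mathcal{N}_{\neq 0} \mid h_e \notin \mathcal{K}]} = \frac{1 \times p_{\Delta}^{d_v}}{p_{\Delta}} = p_{\Delta}^{d_v - 1}.$$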
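The effect of the falsely retained variable nodes on the check degrees can be sketched as follows. The sketch takes the probability `p_prime` that a free edge of a nonzero check node goes to such a variable node as an input, and it assumes that the free edges behave independently so that the transition probabilities are binomial. The function name and this independence assumption are illustrative only.

```python
from math import comb


def primed_check_distribution(p_n, p_prime):
    """p_n[i] is the probability that a check node has i edges into the real
    support set. Returns the distribution of the number of edges into the
    potential support set."""
    d_c = len(p_n) - 1
    p_np = [0.0] * (d_c + 1)
    p_np[0] = p_n[0]                   # zero-valued check nodes are unaffected
    for i in range(1, d_c + 1):
        for j in range(i, d_c + 1):
            # j - i of the d_c - i free edges go to falsely retained nodes.
            move = (comb(d_c - i, j - i)
                    * p_prime ** (j - i)
                    * (1 - p_prime) ** (d_c - j))
            p_np[j] += p_n[i] * move
    return p_np


# Hypothetical numbers with d_c = 2.
p_np = primed_check_distribution([0.2, 0.5, 0.3], p_prime=0.5)
```

Degrees can only increase in this step, which is why the inner loop starts at `j = i`.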



Thus,

$$p_{\mathcal{N}'_i} = \sum_{j=1}^{i} p_{\mathcal{N}_j} \, p_{\mathcal{N}_{ji}}, \qquad i = 1, \cdots, d_c.$$



Variable nodes in $\mathcal{K}_{\Delta}$ are not connected to the set $\mathcal{N}'_1$. Thus, the support set $\mathcal{K}$ is divided into $\mathcal{X}_i$ according to equation Q. The only difference is that $p_{\mathcal{N}'_1}$ should be used instead of $p_{\mathcal{N}_1}$ in the calculation of $p^{(0)}$. Also, variable nodes in $\mathcal{K}_{\Delta}$ will contribute to $\mathcal{X}_0$. This means that:



In this algorithm, like the Genie, $\beta = 1$ and therefore:



v(l) 



{l~Pr). 



The pre-phase in the LM algorithm has two steps before we can use the general formulation presented in Section V. To find the probabilities $p_{\mathcal{N}'_q}^{(1)}$, we use intermediate probabilities, namely the probability that a check node goes from $\mathcal{N}'_j$ to $\mathcal{N}_{ij}$. The complication arises because in the first iteration all the released edges come from $\mathcal{K}^{(0)}$ and should not affect the connections that the sets $\mathcal{N}'_j$ have with $\mathcal{K}_{\Delta}$. The only way to deal with this case is to go back to $\mathcal{N}_{ij}$. The general picture is depicted in Figure 8. Note that:

1) $\mathcal{N}_{ij}$ has $j - i$ connections to $\mathcal{K}_{\Delta}$, $i$ connections to $\mathcal{K}$ and $d_c - j$ free edges.

2) Check nodes may change from $\mathcal{N}'^{(0)}_j$ to $\mathcal{N}'^{(1)}_q$ when there are $i - q$ connections to $\mathcal{K}$; i.e., $j > q$.

Thus, to go from $\mathcal{N}'^{(0)}_j$ to $\mathcal{N}'^{(1)}_q$ we need to go from $\mathcal{N}'^{(0)}_j$ to the intermediate $\mathcal{N}^{(0)}_{ij}$, where $j - q \le i \le j$, and then from $\mathcal{N}^{(0)}_{ij}$ to $\mathcal{N}'^{(1)}_q$.

Thus, the overall formula would be:

Fig. 8: Evolution of the check node sets due to $\mathcal{K}_{\Delta}$ and the resolution of variable nodes in the first iteration.



where is as follows: 



Pf = Pr[/ie e Xr\te e 7V„, K G K.] 

PT[te G J\fu\he G /le G K] Pr[he £ AVj/le £ K] 



( 



Pr[ie G A/',|/le G K] 

\ 



1 - 



V 



1 - p(o) 

l-p(O) 

where Mu = {{M U M U • • ■UMaJW'i'^"''}. 

From this point forth, the formulation presented in the general framework can be used with the probabilities $p_{\mathcal{X}'}$ and $p_{\mathcal{N}'}$ replacing $p_{\mathcal{X}}$ and $p_{\mathcal{N}}$.



